DETAILED ACTION
Response to Arguments 

1. Applicant's arguments with respect to claim 1 have been considered but are moot
in view of the new ground(s) of rejection. 

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1, 2, and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Huang et al. (U.S. Patent 5,913,193), in view of Hata et al. (U.S. Patent 
5,878,393). 

In regard to claim 1, Huang et al. disclose a method for converting text to
concatenated voice by utilizing a digital voice library and a set of playback rules, the 
digital voice library including a plurality of speech items and a corresponding plurality of 
voice recordings wherein each speech item corresponds to at least one available voice 
recording (Fig. 5), multiple voice recordings corresponding to a single speech item 
representing various inflections of that single speech item (a plurality of instances of a 
single speech item are stored, each having a particular variance in pitch and amplitude, 
i.e. "various inflections", column 7, lines 26-30), the method comprising:

receiving text data (input text, column 7, lines 56-58); 



Application/Control Number: 09/818,207 Page 3 

Art Unit: 2626 

expanding the text data to form a sequence of text and pseudo words 
(abbreviated words and phrases are expanded to word phrases, column 7, lines 59-63); 

converting the sequence of text and pseudo words into a sequence of speech 
items in accordance with the digital voice library (steps 124-128, the word string is 
converted to a diphone string, column 8, lines 13-14, lines 24-26, and lines 51-52), 
wherein at least one speech item in the sequence of speech items corresponds to 
multiple voice recordings (each diphone is associated with a plurality of instances of that 
diphone in the database, see Fig. 6A-C and column 8, line 57 to column 9, line 1); 

converting the sequence of speech items into a sequence of voice recordings in 
accordance with the set of playback rules, wherein selecting a voice recording where 
multiple voice recordings are available for a speech item is based on a context around 
the speech item in the text data (a best diphone instance is selected based on the 
adjacent diphones, column 8, lines 57-62; see also Fig. 7 for selection rules); 

generating voice data based on the sequence of voice recordings by 
concatenating adjacent recordings in the sequence of voice recordings (step 132, the 
selected instances are concatenated, column 9, lines 49-57); 

wherein the plurality of speech items includes a plurality of phrases, and wherein 
converting the sequence of text and pseudo words further includes parsing the 
sequence of text and pseudo words to determine any phrases (the illustrative example 
utilizes diphones as speech items, however alternative units are disclosed, including 
phrases, column 4, lines 53-60 and column 1, lines 20-22). 
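
For illustration only, the steps recited above (expanding text, mapping it to speech items, selecting among multiple recordings by context, and concatenating) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Huang et al. or of the application: the abbreviation table, the toy voice library, and the function names `expand`, `select_recording`, and `synthesize` are all hypothetical, and whole words stand in for speech items.

```python
# Hypothetical toy data; none of these names come from the cited references.
ABBREVIATIONS = {"Dr.": "doctor", "St.": "street"}

# Each speech item maps to several recordings, each tagged with the item
# that follows it in the context where it was recorded (None = default).
VOICE_LIBRARY = {
    "doctor": [{"next": "smith", "audio": b"\x01"}, {"next": None, "audio": b"\x02"}],
    "smith": [{"next": None, "audio": b"\x03"}],
}

def expand(text):
    """Expand abbreviations into full words and lowercase every token."""
    return [ABBREVIATIONS.get(tok, tok).lower() for tok in text.split()]

def select_recording(item, following):
    """Pick the recording whose context tag matches the following item.

    Falls back to the last (default) recording when no tag matches.
    """
    candidates = VOICE_LIBRARY[item]
    for rec in candidates:
        if rec["next"] == following:
            return rec
    return candidates[-1]

def synthesize(text):
    """Expand, select one recording per item by context, then concatenate."""
    items = expand(text)
    recordings = []
    for i, item in enumerate(items):
        following = items[i + 1] if i + 1 < len(items) else None
        recordings.append(select_recording(item, following)["audio"])
    return b"".join(recordings)
```

In this sketch the item "doctor" has two recordings, and the one chosen depends on whether "smith" follows it, which mirrors selection "based on a context around the speech item". Items missing from the toy library are not handled.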

Huang et al. further disclose alternative embodiments wherein speech items are
selected so that different instances match well with adjacent units (i.e. represent various 
ligatures, column 7, lines 49-52). 

Huang et al. do not disclose establishing multiple voice recordings in the digital 
voice library that correspond to a single inflection of a single speech item, for a plurality 
of inflections of a plurality of speech items, that represent various ligatures for the single 
inflection of the single speech item with adjacent speech items. 

Hata et al. disclose a method for converting text to concatenated voice by 
utilizing a digital voice library, comprising establishing multiple voice recordings in the 
digital voice library that correspond to a single inflection of a single speech item, for a 
plurality of inflections of a plurality of speech items, that represent various ligatures for 
the single inflection of the single speech item with adjacent speech items (different 
intonations for a particular speech unit are stored, column 4, lines 28-36; then 
pronunciation variants are stored based on unit's adjacent neighbors, i.e. "various 
ligatures", column 4, lines 37-55). 
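
For illustration only, a library that holds several ligature variants for each inflection of each speech item could be organized as below. This is a sketch under assumptions, not the data structure of Hata et al.: the key layout, the inflection labels "rising" and "falling", the file names, and the function `lookup` are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (speech item, inflection, neighboring item); value: a recording name.
# A neighbor of None marks the default recording for that inflection.
Library = Dict[Tuple[str, str, Optional[str]], str]

LIBRARY: Library = {
    ("five", "rising", "o"): "five_rising_before_o.wav",
    ("five", "rising", None): "five_rising_default.wav",
    ("five", "falling", "o"): "five_falling_before_o.wav",
    ("five", "falling", None): "five_falling_default.wav",
}

def lookup(item: str, inflection: str, neighbor: Optional[str]) -> str:
    """Return the ligature-specific recording for this inflection if one
    exists, otherwise the default recording for the same inflection."""
    key = (item, inflection, neighbor)
    if key in LIBRARY:
        return LIBRARY[key]
    return LIBRARY[(item, inflection, None)]
```

The point of the layout is that a single inflection of a single item ("five", "rising") owns more than one recording, one per neighboring context.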

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. to further include multiple voice recordings for each 
single inflection of each single speech item that represented various ligatures for the 
single inflection of the single speech item with adjacent speech items, because this 
would refine the output speech quality by ensuring that regardless of what word 
precedes or follows a given word and what the prosodic environment may be, the voice
library would contain a sample to match, as taught by Hata et al. (column 4, lines 37-40
and lines 53-55). 

In regard to claim 2, Huang et al. disclose searching the text data for an 
abbreviation (abbreviated words are found, column 7, lines 56-60); and 

expanding any abbreviation contained in the text data into at least one pseudo 
word (words corresponding to the abbreviation, column 7, line 60 to column 8, line 12). 
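
For illustration only, searching text for abbreviations and expanding each into a word could look like the sketch below. The abbreviation table and the function name `expand_abbreviations` are hypothetical and are not taken from Huang et al.

```python
import re

# Hypothetical abbreviation table (assumption, not from the cited reference).
ABBREVIATIONS = {"Dr.": "doctor", "Ave.": "avenue", "No.": "number"}

# Match any listed abbreviation that does not start in the middle of a word.
# Longer abbreviations are tried first so a short one cannot shadow them.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(a) for a in sorted(ABBREVIATIONS, key=len, reverse=True))
    + ")"
)

def expand_abbreviations(text: str) -> str:
    """Replace each abbreviation found in the text with its expansion."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(0)], text)
```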

In regard to claim 10, Huang et al. disclose the plurality of speech items includes 
a plurality of syllables, and wherein converting the sequence of text and pseudo words 
further comprises: parsing the sequence of text and pseudo words to determine any
syllables (the illustrative example utilizes diphones as speech items, however 
alternative units are disclosed, including phrases, column 4, lines 53-60 and column 1, 
lines 20-22). 

4. Claims 3-5 and 9 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Huang et al., in view of Hata et al., and further in view of Trader et al. (U.S. Patent 
5,832,432). 

In regard to claim 3, Huang et al. disclose abbreviations, acronyms, character 
strings, and numerical strings are expanded into pseudo words (column 7, line 59 to 
column 8, line 12). 

However, Huang et al. and Hata et al. do not specifically disclose expanding any
numerical suffix contained in the text data into at least one pseudo word. 

Trader et al. disclose expanding the text data further comprises: 

searching the text data for a numerical suffix (Fig. 3e, step 148, engine phrases 
are located, the engine phrase including the numerical suffix "L", see step 80m); and 

expanding any numerical suffix contained in the text data into at least one 
pseudo word (the numerical suffix "L" is expanded to "litre engine", column 5, lines
26-30).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to search for and expand any numerical 
suffixes contained in the text data to at least one pseudo word, because expanding the 
abbreviated text makes the output speech more natural sounding, as taught by Trader 
et al. (column 1, line 63 to column 2, line 4). 
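
For illustration only, the expansion of the numerical suffix "L" to "litre engine" described above could be sketched with a regular expression. The sample input "3.8L", the pattern, and the function name `expand_engine_suffix` are assumptions and do not reproduce the steps of Trader et al.

```python
import re

# A number (optionally with a decimal part) directly followed by the suffix "L".
_ENGINE = re.compile(r"\b(\d+(?:\.\d+)?)\s?L\b")

def expand_engine_suffix(text: str) -> str:
    """Rewrite e.g. '3.8L' as '3.8 litre engine'; other text is unchanged."""
    return _ENGINE.sub(r"\1 litre engine", text)
```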

In regard to claim 4, Huang et al. disclose abbreviations, acronyms, character 
strings, and numerical strings are expanded into pseudo words (column 7, line 59 to 
column 8, line 12). 

However, Huang et al. and Hata et al. do not specifically disclose expanding any 
telephone number contained in the text data into at least one pseudo word. 

Trader et al. disclose expanding the text data further comprises:

searching the text data for a telephone number (Fig. 3b, step 108, phone number
patterns are found in the ad, column 4, lines 62-64); and

expanding any telephone number contained in the text data into at least one
pseudo word (phone number information "555-1212" is expanded to "call Bob at
555-1212", see step 90c).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to search for and expand any telephone 
number contained in the text data into at least one pseudo word, because expanding 
the abbreviated text makes the output speech more natural sounding, as taught by 
Trader et al. (column 1, line 63 to column 2, line 4).
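
For illustration only, locating a telephone number pattern and expanding it into a spoken phrase, as in the "call Bob at 555-1212" example above, could be sketched as follows. The pattern, the `contact` parameter, and the function name `expand_phone_numbers` are assumptions; how Trader et al. determine the contact name is not reproduced here.

```python
import re

# Seven-digit numbers such as 555-1212, with an optional three-digit prefix.
_PHONE = re.compile(r"\b(?:\d{3}-)?\d{3}-\d{4}\b")

def expand_phone_numbers(text: str, contact: str = "Bob") -> str:
    """Wrap each phone number in a phrase naming the (assumed) contact."""
    return _PHONE.sub(lambda m: f"call {contact} at {m.group(0)}", text)
```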

In regard to claim 5, Huang et al. disclose abbreviations, acronyms, character 
strings, and numerical strings are expanded into pseudo words (column 7, line 59 to 
column 8, line 12). 

However, Huang et al. and Hata et al. do not specifically disclose expanding any 
number that includes a comma contained in the text data to at least one pseudo word. 

Trader et al. disclose expanding the text data further comprises: 

searching the text data for a number that includes a comma (Fig. 3c, step 126, 
"42,000" is located in the ad, column 5, lines 8-11); and 

expanding any number that includes a comma contained in the text data into at 
least one pseudo word (number that includes a comma "42,000" is expanded to 
"42000", see step 90f). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to search for and expand any number 
that includes a comma contained in the text data into at least one pseudo word,
because expanding the abbreviated text makes the output speech more natural
sounding, as taught by Trader et al. (column 1, line 63 to column 2, line 4).
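
For illustration only, finding a number that includes a comma and rewriting it without the comma, as in the "42,000" to "42000" example above, could be sketched as follows. The pattern and the function name `expand_comma_numbers` are assumptions rather than the steps of Trader et al.

```python
import re

# One to three digits followed by one or more groups of ",ddd".
_COMMA_NUMBER = re.compile(r"\b\d{1,3}(?:,\d{3})+\b")

def expand_comma_numbers(text: str) -> str:
    """Remove grouping commas from numbers; ordinary commas are untouched."""
    return _COMMA_NUMBER.sub(lambda m: m.group(0).replace(",", ""), text)
```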

In regard to claim 9, Huang et al. disclose alternate speech items (phonetic units 
of speech) are used (column 4, lines 53-60 and column 1, lines 20-22).

However, Huang et al. and Hata et al. do not specifically disclose the alternate 
speech items include a plurality of words. 

Trader et al. disclose the plurality of speech items includes a plurality of words, 
and wherein converting the sequence of text and pseudo words further comprises: 

parsing the sequence of text and pseudo words to determine any words (parsing 
includes matching words in the ad vocabulary table, column 4, lines 10-19). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to parse the sequence of text and 
pseudo words to determine any words, because using words as speech items (phonetic 
units) helps reduce the number of boundaries that occur and capture the coarticulatory
effects over a longer unit, resulting in higher quality sounding output speech. 

5. Claims 6 and 7 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Huang et al., in view of Hata et al., and further in view of Holm et al. (U.S. Patent 
5,850,629). 

In regard to claim 6, Huang et al. disclose abbreviations, acronyms, character
strings, and numerical strings are expanded into pseudo words (column 7, line 59 to 
column 8, line 12). 

However, Huang et al. and Hata et al. do not disclose searching the text for an 
Internet mail address and expanding any Internet mail address contained in the text 
data into at least one pseudo word. 

Holm et al. teach expanding abbreviations and acronyms (Fig. 10) as well as 
ways to handle e-mail addresses (column 14, lines 15-21). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to locate and expand email addresses 
(in, for example, the location of contact info step) in order to properly pronounce text 
containing abbreviations containing e-mail addresses and also detect sentence 
boundaries. 
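
For illustration only, locating an e-mail address and expanding it into pronounceable words could be sketched as follows. The spoken forms "at" and "dot", the pattern, and the function name `expand_email` are assumptions; the specific handling taught by Holm et al. is not reproduced here.

```python
import re

# A simplified e-mail pattern: local part, "@", then dotted domain labels.
_EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

def expand_email(text: str) -> str:
    """Spell out '@' and '.' inside each address; a sentence-final period
    after the address is not part of the match and is left as-is."""
    def speak(match):
        return match.group(0).replace("@", " at ").replace(".", " dot ")
    return _EMAIL.sub(speak, text)
```

Because the trailing period stays outside the match, the same pass keeps the sentence boundary visible to later processing.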

In regard to claim 7, Huang et al. disclose abbreviations, acronyms, character 
strings, and numerical strings are expanded into pseudo words (column 7, line 59 to 
column 8, line 12). 

However, Huang et al. and Hata et al. do not disclose searching the text data for
an Internet Universal Resource Locator and expanding any Internet Universal Resource 
Locator in the text data into at least one pseudo word. 

Holm et al. teach expanding abbreviations and acronyms (Fig. 10) as well as
ways to handle e-mail addresses. Similar to e-mail, web addresses in URL format also 
constitute special abbreviations containing special characters, such as HTML tags. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Huang et al. and Hata et al. to locate and expand Internet Universal
Resource Locators (in, for example, the location of contact info step), in order to
properly pronounce text containing web addresses and other HTML related information. 



Conclusion 

6. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

7. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Brian L. Albertalli whose telephone number is
(571) 272-7616. The examiner can normally be reached on Mon - Fri, 8:00 AM - 5:30 PM,
every second Fri off.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on (571) 272-7843. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 